Learning a Log-Linear Model with Bilingual Phrase-Pair Features for Statistical Machine Translation
Similar resources
Improving Phrase-Based SMT Using Cross-Granularity Embedding Similarity
The phrase-based statistical machine translation (PBSMT) model can be viewed as a log-linear combination of translation and language model features. Such a model typically relies on the phrase table as the main resource for bilingual knowledge, which in its most basic form consists of aligned phrases, along with four probability scores. These scores only indicate the co-occurrence of phrase pair...
Learning Bilingual Projections of Embeddings for Vocabulary Expansion in Machine Translation
We propose a simple log-bilinear softmax-based model to deal with vocabulary expansion in machine translation. Our model uses word embeddings trained on significantly large unlabelled monolingual corpora and learns over a fairly small, word-to-word bilingual dictionary. Given an out-of-vocabulary source word, the model generates a probabilistic list of possible translations in the target language...
The ISL Statistical Machine Translation System for the TC-STAR Spring 2006 Evaluation
In this paper we describe the ISL statistical machine translation system used in the TC-STAR Spring 2006 Evaluation campaign. This system is based on PESA phrase-to-phrase translations which are extracted from a bilingual corpus. The translation model, language model and other features are combined in a log-linear model during decoding. We participated in the Spanish Parliament (Cortes) and Eur...
Learning Bilingual Linguistic Reordering Model for Statistical Machine Translation
In this paper, we propose a method for learning a reordering model for BTG-based statistical machine translation (SMT). The model focuses on linguistic features from bilingual phrases. Our method involves extracting reordering examples as well as features such as part-of-speech and word class from aligned parallel sentences. The features are classified with special considerations of phrase length...
Improve Statistical Machine Translation with Context-Sensitive Bilingual Semantic Embedding Model
We investigate how to improve bilingual embedding which has been successfully used as a feature in phrase-based statistical machine translation (SMT). Despite bilingual embedding’s success, the contextual information, which is of critical importance to translation quality, was ignored in previous work. To employ the contextual information, we propose a simple and memory-efficient model for lear...